1.0 Introduction


1.1.1 This specification is intended to define a cross-platform, 
interoperable file storage and transfer format. Since its 
first publication in 1989, PKWARE, Inc. ("PKWARE") has remained 
committed to ensuring the interoperability of the .ZIP file 
format through periodic publication and maintenance of this 
specification. We trust that all .ZIP compatible vendors and 
application developers that use and benefit from this format 
will share and support this commitment to interoperability. 


1.2 Scope 


1.2.1 ZIP is one of the most widely used compressed file formats. It is
universally used to aggregate, compress, and encrypt files into a single
interoperable container. No specific use or application need is
defined by this format and no specific implementation guidance is
provided. This document provides details on the storage format for
creating ZIP files. Information is provided on the records and
fields that describe what a ZIP file is.


1.3 Trademarks 


1.3.1 PKWARE, PKZIP, Smartcrypt, SecureZIP, and PKSFX are registered 
trademarks of PKWARE, Inc. in the United States and elsewhere. 
PKPatchMaker, Deflate64, and ZIP64 are trademarks of PKWARE, Inc. 
Other marks referenced within this document appear for identification 
purposes only and are the property of their respective owners. 


1.4 Permitted Use 




LATEST RELEASES: Zip 3.00 was released on 7 July 2008. WiZ 5.03 was released on 11 March 2005. UnZip 6.0 was
released on 29 April 2009. MacZip 1.06 was released on 22 February 2001. See the Zip, UnZip and WiZ pages for current
status and download locations.


In addition, a new set of discussion forums was set up in October 2007. These replace the older QuickTopic forum, which 
was overrun by spam. (The spam postings have since been deleted, but further posts to the old forum are permanently 
disabled.) 


About Info-ZIP 


Info-ZIP is a diverse, Internet-based workgroup of about 20 primary authors and over one hundred beta-testers, formed in 1990 as a 
mailing list hosted by Keith Petersen on the original SimTel site at the White Sands Missile Range in New Mexico. 


Info-ZIP's purpose is to provide free, portable, high-quality versions of the Zip and UnZip compressor-archiver utilities that are
compatible with the DOS-based PKZIP by PKWARE, Inc.


Info-ZIP supports hardware from microcomputers all the way up to Cray supercomputers, running on almost all versions of Unix, VMS,
OS/2, Windows 9x/NT/etc. (a.k.a. Win32), Windows 3.x, Windows CE, MS-DOS, AmigaDOS, Atari TOS, Acorn RISC OS, BeOS, Mac
OS, SMS/QDOS, MVS and OS/390 OE, VM/CMS, FlexOS, Tandem NSK and Human68K (Japanese). There is also some (old) support
for LynxOS, TOPS-20, AOS/VS and Novell NLMs. Shared libraries (DLLs) are available for Unix, OS/2, Win32 and Win16, and
graphical interfaces are available for Win32, Win16, WinCE and Mac OS.


Info-ZIP code has been incorporated into a number of third-party products as well, both commercial and freeware. Some of the more
interesting ones (well, historically speaking) include the use of UnZip code in the unzip.dll distributed with IBM's OS/2 Warp
BonusPak and WebExplorer, as part of the reinstallation code for the IBM Aptivas preloaded with OS/2 Warp, and as part of IBM's
Infoprint product. Sun used Info-ZIP's self-extractor to distribute the NT version of their HotJava browser, Novell uses UnZip for
NetWare 6 installation, and SAP includes it in Business One. Various Windows products such as WinZip and the DynaZIP DLLs
incorporate Info-ZIP code, too. And let us not forget Pretty Good Privacy (PGP), an excellent encryption program that uses Info-ZIP
code as a first step in encrypting files. Info-ZIP's primary compression engine has also been spun off into the free zlib compression
library, used in Netscape/Mozilla/Firefox, the Linux kernel, Windows, Java, virtually all PNG-supporting software, and countless other
products.


Info-ZIP can be reached by a web-based form, but you'll have to read our Frequently Asked Questions page to find out how. Our two 
primary web sites are hosted by our very own Hunter Goatley and by the most excellent SourceForge. Secondary distribution sites are 
hosted by the Comprehensive TeX Archive Network. 


A Known Plaintext Attack on the PKZIP 
Stream Cipher 


Eli Biham* Paul C. Kocher**


Abstract. The PKZIP program is one of the more widely used archive/ 
compression programs on personal computers. It also has many compat- 
ible variants on other computers, and is used by most BBS’s and ftp 
sites to compress their archives. PKZIP provides a stream cipher which 
allows users to scramble files with variable length keys (passwords). 

In this paper we describe a known plaintext attack on this cipher, which 
can find the internal representation of the key within a few hours on 
a personal computer using a few hundred bytes of known plaintext. In 
many cases, the actual user keys can also be found from the internal 
representation. We conclude that the PKZIP cipher is weak, and should 
not be used to protect valuable data. 


1 Introduction 


The PKZIP program is one of the more widely used archive/compression programs
on personal computers. It also has many compatible variants on other
computers (such as Infozip's zip/unzip), and is used by most BBS's and ftp sites 
to compress their archives. PKZIP provides a stream cipher which allows users to 
scramble the archived files under variable length keys (passwords). This stream 
cipher was designed by Roger Schlafly. 

In this paper we describe a known plaintext attack on the PKZIP stream 
cipher which takes a few hours on a personal computer and requires about 13- 
40 (compressed) known plaintext bytes, or the first 30-200 uncompressed bytes, 
when the file is compressed. The attack primarily finds the 96-bit internal rep- 
resentation of the key, which suffices to decrypt the whole file and any other file 
encrypted under the same key. Later, the original key can be constructed. This 
attack was used to find the key of the PKZIP contest. 

The analysis in this paper holds for both versions of PKZIP: version 1.10 and
version 2.04g. The ciphers used in the two versions differ in minor details, which
do not affect the analysis.

The structure of this paper is as follows: Section 2 describes PKZIP and the
PKZIP stream cipher. The attack is described in Section 3, and a summary of
the results is given in Section 4.


* Computer Science Department, Technion - Israel Institute of Technology, Haifa 
32000, Israel 

** Independent cryptographic consultant, 7700 N.W. Ridgewood Dr., Corvallis, OR 
97330, USA 
return (int)(((temp * (temp ^ 1)) >> 8) & 0xff);
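The return statement above is the final step of the cipher's keystream-byte function. As a minimal sketch of how the 96-bit internal state (three 32-bit keys) is updated and used, assuming the traditional PKZIP scheme with its usual initial key values and multiplier, the following Python may help. The class and function names are illustrative and do not come from any ZIP implementation, and the 12-byte encryption header that real archives prepend to each file is left out.

```python
# Illustrative sketch of the traditional PKZIP stream cipher.
# Names (ZipCryptoKeys, encrypt, decrypt) are hypothetical.

def _make_crc_table():
    """Build the standard reflected CRC-32 table (polynomial 0xEDB88320)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table

CRC_TABLE = _make_crc_table()

def crc32_step(old, byte):
    """One-byte CRC-32 update without the initial/final inversion."""
    return (old >> 8) ^ CRC_TABLE[(old ^ byte) & 0xFF]

class ZipCryptoKeys:
    """Holds the 96-bit internal state: key0, key1, key2."""

    def __init__(self, password: bytes):
        self.k0, self.k1, self.k2 = 0x12345678, 0x23456789, 0x34567890
        for b in password:
            self.update(b)

    def update(self, plain_byte):
        self.k0 = crc32_step(self.k0, plain_byte)
        # key1 is a truncated linear congruential generator step.
        self.k1 = ((self.k1 + (self.k0 & 0xFF)) * 134775813 + 1) & 0xFFFFFFFF
        self.k2 = crc32_step(self.k2, self.k1 >> 24)

    def stream_byte(self):
        temp = (self.k2 & 0xFFFF) | 2
        return ((temp * (temp ^ 1)) >> 8) & 0xFF

def encrypt(password: bytes, data: bytes) -> bytes:
    keys = ZipCryptoKeys(password)
    out = bytearray()
    for p in data:
        out.append(p ^ keys.stream_byte())
        keys.update(p)  # the state is always updated with the plaintext byte
    return bytes(out)

def decrypt(password: bytes, data: bytes) -> bytes:
    keys = ZipCryptoKeys(password)
    out = bytearray()
    for c in data:
        p = c ^ keys.stream_byte()
        keys.update(p)
        out.append(p)
    return bytes(out)
```

Because the state after processing the password is all that encryption and decryption depend on, recovering the three keys is enough to decrypt without knowing the password itself.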
Abstract. Biham and Kocher demonstrated that the PKZIP stream cipher
was weak and presented an attack requiring thirteen bytes of plaintext.
The deflate algorithm “zippers” now use to compress the plaintext
before encryption makes it difficult to get known plaintext. We consider
the problem of reducing the amount of known plaintext by finding
other ways to filter key guesses. In most cases we can reduce the amount
of known plaintext from the archived file to two or three bytes,
depending on the zipper used and the number of files in the archive.
For the most popular zippers on the Internet, there is a fast attack
that does not require any information about the files in the archive;
instead, it gets doubly-encrypted plaintext by exploiting a weakness in
the pseudorandom-number generator.


1 Introduction 


PKZIP is a compression / archival program created by Phil Katz. Katz had the 
foresight to document his file format completely in the file APPNOTE.TXT, 
distributed with every copy of PKZIP; there are now literally hundreds of “zip- 
per” programs available, and the ZIP file format has become a de facto standard 
on the Internet. 

In [BK94] Biham and Kocher demonstrated that the PKZIP stream cipher 
was weak and presented an attack requiring thirteen bytes of plaintext. Eight 
bytes of the plaintext must be contiguous, and all of the bytes must be the 
text that was encrypted, which is usually compressed data. [K92] shows that 
the compression method used at the time, implode, produces many predictable 
bytes suitable for mounting the attack. 

Most zippers available today implement only one of the compression methods 
defined in APPNOTE.TXT, called deflate. Deflate uses a variant of Lempel-Ziv
followed by Huffman coding. Once the dictionary reaches a certain size, the process
starts over. Since the Huffman codes for any of the data depend on a great deal of 
surrounding data, one is forced to guess the plaintext unless one has the original 
data. The difficulty of getting known plaintext was one reason Phil Zimmermann
decided to use deflate in PGP [PGP98]. Practically speaking, if one has enough 
of the original file to get the thirteen bytes of plaintext required for the attack 
in [BK94], one has enough to break the encryption almost instantly. 


M. Matsui (Ed.): FSE 2001, LNCS 2355, pp. 125-134, 2002. 
© Springer-Verlag Berlin Heidelberg 2002 
> Compare work done this year to find
hash collisions in SHA-1, a 160-bit hash
function, which required 2^64 hashes and
cost around $100K in specialized
hardware and electricity.


I can say that I could spend so much on
this archive. In any case, this is less than
trying out a password from a bunch of
characters or brute-forcing all 3 zip keys.
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Current: 16 + 4 carry bits = 20 
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Breaking Truncated Linear Congruential Generator with known parameters 




There is an elaborate discussion of breaking a TLCG at the link below, where they show how
to break the generator with known parameters given the most significant bits: "Problem with LLL
reduction on truncated LCG schemes".


I tried to apply the same principles when given the least significant bits, but with no success. In the
paper by Frieze et al. they discuss it briefly and mention substituting x = 2^(s0) * x(1) + x(2), which helps
a little bit, but I can't figure out what the value of s0 is supposed to be. Is there anyone who can help?


cryptanalysis | lattice-crypto 


asked Apr 28 '18 at 10:57 by Norman W, edited Apr 28 '18 at 12:28 by Paul Uszak
Is that substitution formula correct mathematics? — Paul Uszak Apr 28 '18 at 12:31 


Could you state the givens in the problem at hand? Is the modulus prime, a power of two, other? – fgrieu
Apr 28 '18 at 12:55


1 Answer


According to the paper by Frieze et al., for the most-significant-bits case, which is discussed at length in
the link, the modulus M must be odd, the multiplier, I assume, can be any number between 1 and the
modulus, and the increment must be zero. So, given a*x(i) + b mod M: M is odd, 0 < a < M, b = 0.
Because both a and M are known and we obtain the high-order bits y(i), we can write x = y + z, where y is the
high-order bits and z is the low-order bits.


Lx ≡ 0 mod M. Taking B, the reduced basis of L, yields Bx ≡ 0 mod M. Substituting x = y + z gives Bz +
By ≡ 0 mod M, which yields the equation Bz + By = Mk for an unknown vector k of integers.


From this point it's easy to find the lower-order bits z (see the link above).
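To make the relation concrete, here is a small Python check, offered as a sketch rather than as the linked answer's method. The modulus, multiplier, seed and bit split are made-up illustrative values, and the unreduced matrix L stands in for B, since every row of an LLL-reduced basis is an integer combination of the rows of L and so satisfies the same congruence.

```python
# Check that L x = 0 (mod M) and L z + L y = M k for a multiplicative LCG.
# M, a, the seed and LOW_BITS are hypothetical values chosen for illustration.

M = 2**31 - 1        # odd modulus
a = 16807            # multiplier, increment b = 0
n = 4                # number of consecutive outputs used
LOW_BITS = 16        # bits treated as unknown (z); the rest are known (y)

def lcg_outputs(seed):
    """Return n consecutive states x(1..n) of x(i+1) = a * x(i) mod M."""
    xs, x = [], seed
    for _ in range(n):
        x = (a * x) % M
        xs.append(x)
    return xs

# Each row encodes one congruence: row 0 is M*x1 = 0, row j is a^j*x1 - x(j+1) = 0.
L = [[M,    0,  0,  0],
     [a,   -1,  0,  0],
     [a**2, 0, -1,  0],
     [a**3, 0,  0, -1]]

def mat_vec(rows, v):
    return [sum(r * c for r, c in zip(row, v)) for row in rows]

xs = lcg_outputs(123456789)
ys = [x & ~((1 << LOW_BITS) - 1) for x in xs]   # known high-order part
zs = [x - y for x, y in zip(xs, ys)]            # unknown low-order part

lx = mat_vec(L, xs)
lz, ly = mat_vec(L, zs), mat_vec(L, ys)
ks = [(p + q) // M for p, q in zip(lz, ly)]     # the integer vector k
```

With a reduced basis B in place of L, the entries of Bz are small, which is what lets k be found by rounding By / M.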


In the case where we are given the low-order bits z instead of the high-order bits y, the paper
suggests substituting x = 2^(s0) * y + z. They talk about finding the inverse of 2^(s0) mod M, which
is guaranteed to exist since M is odd. But that's where I get completely lost.
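One possible reading of that hint, which I have not checked against the paper, is that s0 is simply the number of known low-order bits. Multiplying the recurrence by the inverse of 2^(s0) then turns it into a recurrence on the unknown high parts y alone, with known additive offsets, and each y is small because it fits in the remaining bits. The Python below only verifies that identity; M, a, s0 and the seed are made-up values.

```python
# Verify: if x(i+1) = a*x(i) mod M and x = 2^s0 * y + z with z known, then
#   y(i+1) = a*y(i) + u*(a*z(i) - z(i+1))  (mod M),  where u = (2^s0)^-1 mod M.
# M, a, s0 and the seed are hypothetical illustrative values.

M = 2**31 - 1            # odd modulus, so 2^s0 is invertible mod M
a = 16807                # multiplier, increment 0
s0 = 8                   # assumed: the number of known low-order bits
u = pow(2**s0, -1, M)    # modular inverse (needs Python 3.8+)

x = 123456789
xs = []
for _ in range(5):
    x = (a * x) % M
    xs.append(x)

zs = [v & (2**s0 - 1) for v in xs]   # known low-order bits
ys = [v >> s0 for v in xs]           # unknown high-order parts, each < M / 2^s0

# Known offsets d(i) computed from the observed low bits only.
ds = [(u * (a * zs[i] - zs[i + 1])) % M for i in range(len(xs) - 1)]
predicted = [(a * ys[i] + ds[i]) % M for i in range(len(xs) - 1)]
```

Here `predicted[i]` equals `ys[i + 1]`, so the low-bits problem has the same shape as the high-bits one: a linear congruence in unknowns that are known to be small.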

answered Apr 28 '18 at 17:08
tlcg4.sage

M = 2^32
c = 0x08088405

L = matrix([
    [  M,  0,  0,  0],
    [c^1, -1,  0,  0],
    [c^2,  0, -1,  0],
    [c^3,  0,  0, -1]
])

B = L.LLL()

size = 4

k10 = randint(0, M)
ks = [ c^(n + 1) * k10 % M for n in range(size) ]
print "ks: "
print map(hex, ks)

msbs = [(k & 0xff000000) for k in ks]
secret = [ks[i] - msbs[i] for i in range(size)]

w1 = B * vector(msbs)
w2 = vector([ round(RR(w) / M) * M - w for w in w1 ])
guess = list(B.solve_right(w2))

print "guess: "
# print [hex(Integer(guess[i])) for i in range(size)]
print guess

print "diff from msb + guess: "
# print [hex(Integer(ks[i] - msbs[i] - guess[i])) for i in range(size)]
print vector(ks) - vector(msbs) - vector(guess)
mitm_stage1.cpp 2020-07-31
}


void write_stage1_candidate_file(FILE *f,
                                 const vector<stage1_candidate> &candidates,
                                 const size_t start_idx, const size_t num) {
  fprintf(stderr,
          "write_stage1_candidate_file: writing %ld candidates "
          "out of %ld to file starting at index %ld.\n",
          num, candidates.size(), start_idx);

  write_word(f, num);
  auto end_idx = start_idx + num;
  for (size_t i = start_idx; i < end_idx; ++i) {
    write_stage1_candidate(f, candidates[i]);
  }
}

// info: the info about the archive to attack
// table: vector<vector<stage1a>> table(0x01000000)
void mitm_stage1a(archive_info &info, vector<vector<stage1a>> &table,
                  correct_guess *c) {
  // STAGE 1
  //
  // Guess s0, chunk2, chunk3 and carry bits.
  uint8_t xf0 = info.file[0].x[0];
  uint8_t xf1 = info.file[1].x[0];
  uint32_t extra(0);

  for (uint16_t s0 = 0; s0 < 0x100; ++s0) {
    fprintf(stderr, "%02x ", s0);
    if ((s0 & 0xf) == 0xf) {
      fprintf(stderr, "\n");
    }
    for (uint16_t chunk2 = 0; chunk2 < 0x100; ++chunk2) {
      for (uint16_t chunk3 = 0; chunk3 < 0x100; ++chunk3) {
        for (uint8_t carries = 0; carries < 0x10; ++carries) {
          if (nullptr != c && s0 == c->sx[0][0] &&
              chunk2 == c->chunk2 && chunk3 == c->chunk3 &&
              carries == (c->carries >> 12)) {
            fprintf(stderr, "On correct guess.\n");
          }
          uint8_t carryxf0 = carries & 1;
          uint8_t carryyf0 = (carries >> 1) & 1;
          uint8_t carryxf1 = (carries >> 2) & 1;
          uint8_t carryyf1 = (carries >> 3) & 1;
          uint32_t upper = 0x01000000;  // exclusive
          uint32_t lower = 0x00000000;  // inclusive

          uint32_t k0crc = chunk2;
          uint32_t extra = 0;
          uint8_t msbxf0 =
              first_half_step(xf0, false, chunk3, carryxf0, k0crc,
                              extra, upper, lower);
          uint8_t yf0 = xf0 ^ s0;
          k0crc = chunk2;
          extra = 0;
          uint8_t msbyf0 =
              first_half_step(yf0, false, chunk3, carryyf0, k0crc,
                              extra, upper, lower);
          if (upper < lower) {
            if (nullptr != c && s0 == c->sx[0][0] &&
                chunk2 == c->chunk2 && chunk3 == c->chunk3 &&
                carries == (c->carries >> 12)) {
              fprintf(stderr,
                      "Failed to get correct guess: s0 = %02x, "
                      "chunk2 = %02x, "
                      "chunk3 = "
                      "%02x, carries = %x\n",
                      s0, chunk2, chunk3, carries);
            }
            continue;
          }
          k0crc = chunk2;
          extra = 0;
          uint8_t msbxf1 =
              first_half_step(xf1, false, chunk3, carryxf1, k0crc,
                              extra, upper, lower);
          if (upper < lower) {
            if (nullptr != c && s0 == c->sx[0][0] &&
                chunk2 == c->chunk2 && chunk3 == c->chunk3 &&
                carries == (c->carries >> 12)) {
              fprintf(stderr,
                      "Failed to get correct guess: s0 = %02x, "
                      "chunk2 = %02x, "
                      "chunk3 = "
                      "%02x, carries = %x\n",
                      s0, chunk2, chunk3, carries);
            }
            continue;
          }
          uint8_t yf1 = xf1 ^ s0;
          k0crc = chunk2;
          extra = 0;
          uint8_t msbyf1 =
              first_half_step(yf1, false, chunk3, carryyf1, k0crc,
                              extra, upper, lower);
          if (upper < lower) {
            if (nullptr != c && s0 == c->sx[0][0] &&
                chunk2 == c->chunk2 && chunk3 == c->chunk3 &&
                carries == (c->carries >> 12)) {
              fprintf(stderr,
                      "Failed to get correct guess: s0 = %02x, "
                      "chunk2 = %02x, "
                      "chunk3 = "
                      "%02x, carries = %x\n",
                      s0, chunk2, chunk3, carries);
            }
            continue;
          }
          uint32_t mk = toMapKey(msbxf0, msbyf0, msbxf1, msbyf1);
          if (nullptr != c && s0 == c->sx[0][0] &&
              chunk2 == c->chunk2 && chunk3 == c->chunk3 &&
              carries == (c->carries >> 12)) {
            fprintf(stderr,
                    "MSBs: %02x, %02x, %02x, %02x, Mapkey: %08x, "
                    "carries: %x, "
                    "c.carries: %04x\n",
                    msbxf0, msbyf0, msbxf1, msbyf1, mk, carries,
                    c->carries);
          }
          stage1a candidate = {uint8_t(s0), uint8_t(chunk2),
                               uint8_t(chunk3), carries, msbxf0};
          table[mk].push_back(candidate);
        }
      }
    }
  }
}

// info: the info about the archive to attack
// table: the output of mitm_stage1a
// candidates: an empty vector
void mitm_stage1b(const archive_info &info,
                  const vector<vector<stage1a>> &table,
                  vector<stage1_candidate> &candidates, const correct_guess *c,
                  size_t *correct_candidate_index) {
  // Second half of MITM for stage 1
  bool found_correct = false;
  for (uint16_t s1xf0 = 0; s1xf0 < 0x100; ++s1xf0) {
    for (uint8_t prefix = 0; prefix < 0x40; ++prefix) {
      uint16_t pxf0(preimages[s1xf0][prefix]);
      if (nullptr != c && s1xf0 == c->sx[0][1]) {
        fprintf(stderr, "s1xf0: %02x, prefix: %04x ", s1xf0, pxf0);
      }
      if ((prefix & 3) == 3) {
        fprintf(stderr, "\n");
      }

      vector<uint8_t> firsts(0);
      uint8_t s1yf0 = s1xf0 ^ info.file[0].x[1] ^ info.file[0].h[1];
      second_half_step(pxf0, s1yf0, firsts);
      if (!firsts.size()) {
        continue;
      }
      for (uint16_t s1xf1 = 0; s1xf1 < 0x100; ++s1xf1) {
        vector<uint8_t> seconds(0);
        second_half_step(pxf0, s1xf1, seconds);
        if (!seconds.size()) {
          continue;
        }
        vector<uint8_t> thirds(0);
        uint8_t s1yf1 = s1xf1 ^ info.file[1].x[1] ^ info.file[1].h[1];
        second_half_step(pxf0, s1yf1, thirds);
        if (!thirds.size()) {
          continue;
        }
        for (auto f : firsts) {
          for (auto s : seconds) {
            for (auto t : thirds) {
              uint32_t mapkey(f | (s << 8) | (t << 16));
              for (stage1a candidate : table[mapkey]) {
                stage1_candidate g;
                g.chunk2 = candidate.chunk2;
                g.chunk3 = candidate.chunk3;
                g.cb1 = candidate.cb;
                g.m1 =
                    (candidate.msbk11xf0 * 0x01010101) ^ mapkey;

                // Get ~4 possible solutions for lo24(k20) =
                // chunks 1 and 4
                //       A B C D   k20
                // ^     E F G H   crc32tab[D]
                //   ----------
                //       I J K L   crck20
                // ^     M N O P   crc32tab[msbk11xf0]
                //   ----------
                //       Q R S T   (pxf0 << 2) matches k21xf0

                // Starting at the bottom, derive 15..2 of KL
                // from 15..2 of ST and OP
                uint16_t crck20 =
                    ((pxf0 << 2) ^
                     crc32tab[candidate.msbk11xf0]) &
                    0xfffc;

                // Now starting at the top, iterate over 64
                // possibilities for 15..2 of CD
                for (uint8_t i = 0; i < 64; ++i) {
                  uint32_t maybek20 =
                      (preimages[candidate.s0][i] << 2);
                  // and 4 possibilities for low two bits of D
                  for (uint8_t lo = 0; lo < 4; ++lo) {
                    // CD
                    maybek20 = (maybek20 & 0xfffc) | lo;
                    // L' = C ^ H
                    uint8_t match =
                        (maybek20 >> 8) ^
                        crc32tab[maybek20 & 0xff];
                    // If upper six bits of L == upper six
                    // of L' then we have a candidate
                    if ((match & 0xfc) == (crck20 & 0xfc)) {
                      // KL ^ GH = BC. (B = BC >> 8) &
                      // 0xff.
                      uint8_t b =
                          ((crck20 ^
                            crc32tab[maybek20 & 0xff]) >>
                           8) &
                          0xff;
                      if (g.k20_count >= g.MAX_K20S) {
                        fprintf(stderr,
                                "Not enough space for "
                                "k20 candidate in "
                                "stage1_candidate.\n");
                        abort();
                      }
                      // BCD = (B << 16) | CD
                      g.maybek20[g.k20_count] =
                          (b << 16) | maybek20;
                      g.k20_count += 1;
                    }
                  }
                }
                if (0 == g.k20_count) {
                  continue;
                }
                candidates.push_back(g);
                if (nullptr != c && s1xf0 == c->sx[0][1] &&
                    s1xf1 == c->sx[1][1] &&
                    candidate.s0 == c->sx[0][0] &&
                    candidate.chunk2 == c->chunk2 &&
                    candidate.chunk3 == c->chunk3 &&
                    candidate.cb == (c->carries >> 12)) {
                  found_correct = true;
                  fprintf(stderr,
                          "Correct candidates index = %lx\n",
                          candidates.size() - 1);
                  if (nullptr != correct_candidate_index) {
                    *correct_candidate_index =
                        candidates.size() - 1;
                  }
                }
              }
            }
          }
        }
      }
    }
  }
  if (c != nullptr && !found_correct) {
    fprintf(stderr,
            "Failed to use correct guess: s1xf0 = %02x, s1xf1 = %02x\n",
            c->sx[0][1], c->sx[1][1]);
  }
  fprintf(stderr, "Stage 1 candidates.size() == %04lx\n", candidates.size());
}


};  // namespace mitm_stage1
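The byte diagram in the comments of mitm_stage1b (k20 = A B C D, crc32tab[D] = E F G H, crck20 = I J K L) can be checked in isolation. The Python sketch below assumes crck20 is one CRC-32 table step applied to k20, that is (k20 >> 8) ^ crc32tab[k20 & 0xff], which is what the diagram suggests. The helper names are hypothetical and are not taken from the C++ source.

```python
# Sketch of the "KL ^ GH = BC" relation from the comment diagram.
# Helper names are illustrative only.
import random

def make_crc_table():
    """Standard reflected CRC-32 table (polynomial 0xEDB88320)."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table

CRC32TAB = make_crc_table()

def crck20_bits_15_2(k20):
    """Bits 15..2 of (k20 >> 8) ^ crc32tab[D]; the low two bits are unknown."""
    return ((k20 >> 8) ^ CRC32TAB[k20 & 0xFF]) & 0xFFFC

def low_byte_matches(cd, kl):
    """Filter: upper six bits of L must equal upper six bits of L' = C ^ H."""
    match = ((cd >> 8) ^ CRC32TAB[cd & 0xFF]) & 0xFF
    return (match & 0xFC) == (kl & 0xFC)

def recover_b(cd, kl):
    """Recover byte B of k20 from CD (low 16 bits of k20) and KL."""
    return ((kl ^ CRC32TAB[cd & 0xFF]) >> 8) & 0xFF

def check(trials=1000, seed=1):
    """Return True if the relation holds for `trials` random 24-bit k20 values."""
    rng = random.Random(seed)
    for _ in range(trials):
        k20 = rng.getrandbits(24)          # bytes B, C, D
        kl = crck20_bits_15_2(k20)
        cd = k20 & 0xFFFF
        if not low_byte_matches(cd, kl):
            return False
        if recover_b(cd, kl) != (k20 >> 16) & 0xFF:
            return False
    return True
```

The filter keeps every correct CD guess, and B then follows directly from bits 15..8 of crck20, so a surviving guess yields all of the low 24 bits of k20.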
__global__ void gpu_stage3_kernel(const gpu_stage2_candidate *candidates,
                                  keys *results,
                                  const archive_info* archive,
                                  const uint32_t stage2_candidate_count,
                                  const mitm::correct_guess& c) {
  int i = threadIdx.x + blockDim.x * blockIdx.x;
  if (i < stage2_candidate_count) {
    keys result = {0, 0, 0};
    stage3::gpu_stage3_internal(*archive, candidates[i], &result, &c);
    results[i].crck00 = result.crck00;
    results[i].k10 = result.k10;
    results[i].k20 = result.k20;